Inside Airbnb publishes publicly available data on Airbnb listings. The website is independent: it is not endorsed by Airbnb and is not affiliated with any of Airbnb's competitors.
Airbnb does not have a publicly available database, so Inside Airbnb is the closest available approximation to Airbnb's own data. The data files are sourced from the Airbnb website and provide the main metrics for examining Airbnb activity in major cities around the world.
The data used here covers Airbnb activity in Shanghai.
The main purpose here is to pre-process the data, particularly the text data in reviews, and to provide some interactive visual analysis of the Shanghai data, with additional analysis of pricing.
# Load the libraries and data files
import pandas as pd
import numpy as np
import gzip
import plotly.express as px
import seaborn as sns
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')
with gzip.open('calendar.csv.gz') as C:
    c = pd.read_csv(C)
with gzip.open('listings.csv.gz') as l:
    LD = pd.read_csv(l)
LS = pd.read_csv('SH_listings_summary.csv')
NB = pd.read_csv('SH_neighbourhoods.csv')
RV = pd.read_csv('SH_reviews.csv')
c.head()
| | listing_id | date | available | price | adjusted_price | minimum_nights | maximum_nights |
|---|---|---|---|---|---|---|---|
| 0 | 1950925 | 2020-06-21 | t | $1,709.00 | $1,709.00 | 1.0 | 1125.0 |
| 1 | 24963 | 2020-06-21 | t | $495.00 | $495.00 | NaN | NaN |
| 2 | 24963 | 2020-06-22 | f | $495.00 | $495.00 | NaN | NaN |
| 3 | 24963 | 2020-06-23 | f | $495.00 | $495.00 | 3.0 | 365.0 |
| 4 | 24963 | 2020-06-24 | t | $495.00 | $495.00 | 3.0 | 365.0 |
print('Number of unique properties:', c.listing_id.nunique(), 'listings', '&',
'Number of dates:', c.date.nunique(), 'dates')
print('First and last date:', c.date.min(), '&', c.date.max())
Number of unique properties: 41415 listings & Number of dates: 372 dates
First and last date: 2020-06-20 & 2021-06-26
list(LS)
['id', 'name', 'host_id', 'host_name', 'neighbourhood_group', 'neighbourhood', 'latitude', 'longitude', 'room_type', 'price', 'minimum_nights', 'number_of_reviews', 'last_review', 'reviews_per_month', 'calculated_host_listings_count', 'availability_365']
# rename the listing id column in LS to merge with c id
LS.rename(columns = {'id':'listing_id'}, inplace = True)
# remove the columns that also exist in c, so the merge does not create duplicate columns
LS.drop(columns = ['price',
                   'minimum_nights'], inplace = True)
# number of unique listings is the same in both LS and c
m1 = pd.merge(c, LS, on = 'listing_id', how = 'left')
m1.listing_id.nunique()
41415
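The comment above says that LS and c cover the same listings, and the merge is expected to attach at most one summary row to each calendar row. A minimal sketch of how pandas can check both assumptions during the merge, using the `validate` and `indicator` parameters of `pd.merge`. The two small frames are made-up stand-ins for c and LS, not the real data.

```python
import pandas as pd

# Toy stand-ins for the calendar (c) and listings summary (LS) tables.
calendar = pd.DataFrame({
    'listing_id': [1, 1, 2, 3],
    'date': ['2020-06-21', '2020-06-22', '2020-06-21', '2020-06-21'],
})
summary = pd.DataFrame({
    'listing_id': [1, 2],
    'room_type': ['Private room', 'Entire home/apt'],
})

# validate raises MergeError if listing_id is duplicated on the right side;
# indicator adds a _merge column showing where each row found a match.
merged = pd.merge(calendar, summary, on='listing_id', how='left',
                  validate='many_to_one', indicator=True)

# Calendar rows whose listing has no summary record
unmatched = merged.loc[merged['_merge'] == 'left_only', 'listing_id'].tolist()
print(len(merged), unmatched)
```

With a left join the row count of the calendar is preserved, so an empty `unmatched` list is what confirms that every calendar listing also appears in the summary.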
m1.head(1)
| | listing_id | date | available | price | adjusted_price | minimum_nights | maximum_nights | name | host_id | host_name | neighbourhood_group | neighbourhood | latitude | longitude | room_type | number_of_reviews | last_review | reviews_per_month | calculated_host_listings_count | availability_365 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1950925 | 2020-06-21 | t | $1,709.00 | $1,709.00 | 1.0 | 1125.0 | Selected two bedroom apartment | 10044315 | Sophy | NaN | 黄浦区 / Huangpu District | 31.22776 | 121.48164 | Private room | 25 | 2017-05-29 | 0.33 | 6 | 365 |
list(m1)
['listing_id', 'date', 'available', 'price', 'adjusted_price', 'minimum_nights', 'maximum_nights', 'name', 'host_id', 'host_name', 'neighbourhood_group', 'neighbourhood', 'latitude', 'longitude', 'room_type', 'number_of_reviews', 'last_review', 'reviews_per_month', 'calculated_host_listings_count', 'availability_365']
m1.neighbourhood.value_counts()
浦东新区 / Pudong 5396239
黄浦区 / Huangpu District 2127077
徐汇区 / Xuhui District 1704201
静安区 / Jing'an District 1059513
闵行区 / Minhang District 934568
长宁区 / Changning District 630881
松江区 / Songjiang District 476745
杨浦区 / Yangpu District 476481
虹口区 / Hongkou District 433370
崇明区 / Chongming District 401503
青浦区 / Qingpu District 401160
普陀区 / Putuo District 353755
嘉定区 / Jiading District 294946
宝山区 / Baoshan District 276725
奉贤区 / Fengxian District 86505
金山区 / Jinshan District 65700
Name: neighbourhood, dtype: int64
# Remove the $ sign and the thousands separator in price, then convert to numeric (float)
m1.price = m1.price.str.replace('$', '', regex = False) # literal match; replace with an empty string
m1.price = m1.price.str.replace(',', '', regex = False)
m1.price = pd.to_numeric(m1.price)
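The same price clean-up is applied to several frames in this notebook, so it may be worth wrapping in a small function. A minimal sketch: `parse_price` is a hypothetical helper name, not something from the original notebook, and the sample values are made up. Passing `regex=False` makes `$` a literal character rather than a regular-expression anchor, and `errors='coerce'` is an assumption about how unparseable values should be handled.

```python
import pandas as pd

def parse_price(series):
    """Turn strings such as '$1,709.00' into floats (hypothetical helper)."""
    cleaned = (series.str.replace('$', '', regex=False)
                     .str.replace(',', '', regex=False))
    # errors='coerce' turns anything unparseable into NaN instead of raising
    return pd.to_numeric(cleaned, errors='coerce')

raw = pd.Series(['$1,709.00', '$495.00', None, 'n/a'])
prices = parse_price(raw)
print(prices.tolist())
```

Coercing silently hides bad values, so counting the NaNs afterwards (`prices.isna().sum()`) is a reasonable sanity check.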
# Convert date object to datetime format
m1.date = pd.to_datetime(m1.date)
# Extract the individual date parts
m1['day_of_week'] = pd.DatetimeIndex(m1.date).day_name() # weekday_name was removed in pandas 1.0
m1['day_of_month'] = pd.DatetimeIndex(m1.date).day
m1['year'] = pd.DatetimeIndex(m1.date).year
m1['month'] = pd.DatetimeIndex(m1.date).month
m1['day_of_week'].value_counts()
Monday 2169643
Sunday 2164684
Tuesday 2162965
Saturday 2159658
Thursday 2155263
Wednesday 2153578
Friday 2153578
Name: day_of_week, dtype: int64
p = m1.groupby(['month', 'room_type'],
as_index = False)['price'].mean()
px.scatter(p,
x = 'month', y = 'price',
color = 'room_type',
trendline = 'lowess')
p = m1.groupby(['day_of_week', 'room_type'],
as_index = False)['price'].mean()
px.bar(p,
x = 'day_of_week', y = 'price',
color = 'room_type')
p = m1.groupby(['day_of_month', 'room_type'],
as_index = False)['price'].mean()
px.line(p,
x = 'day_of_month',
y = 'price',
color = 'room_type')
p = m1.groupby(['month', 'neighbourhood'],
as_index = False)['price'].mean()
px.scatter(p,
x = 'month', y = 'price',
color = 'neighbourhood',
trendline = 'lowess')
p = m1.groupby(['day_of_month', 'neighbourhood'],
as_index = False)['price'].mean()
px.line(p,
x = 'day_of_month', y = 'price',
color = 'neighbourhood')
p = m1.groupby(['day_of_week', 'neighbourhood'],
as_index = False)['price'].mean()
px.bar(p,
x = 'day_of_week', y = 'price',
color = 'neighbourhood')
days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday',
'Saturday', 'Sunday']
w = m1[['day_of_week', 'price']]
# Average price
w = w.groupby(['day_of_week']).mean().reindex(days)
w = w.reset_index()
px.line(w,
x = 'day_of_week',
y = 'price')
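The `reindex(days)` step above puts the weekdays in calendar order for this one plot. An alternative, sketched here on made-up data, is to store `day_of_week` as an ordered categorical so that every later groupby and plot follows the same order. This is a swapped-in technique, not what the notebook does.

```python
import pandas as pd

days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday',
        'Saturday', 'Sunday']

df = pd.DataFrame({
    'date': pd.to_datetime(['2020-06-21', '2020-06-22',
                            '2020-06-23', '2020-06-28']),
    'price': [100.0, 200.0, 300.0, 300.0],
})

# Ordered categorical: the order comes from `days`, not the alphabet
df['day_of_week'] = pd.Categorical(df['date'].dt.day_name(),
                                   categories=days, ordered=True)

# observed=True keeps only the weekdays present in the data; the result
# is sorted by category order (Monday first).
by_day = df.groupby('day_of_week', observed=True)['price'].mean()
print(by_day.index.tolist())
```

The grouped bar charts by `room_type` and `neighbourhood` would then also come out Monday to Sunday without a separate reindex.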
months = [1, 2, 3, 4, 5, 6,
7, 8, 9, 10, 11 , 12]
m = m1[['month', 'price']]
m = m.groupby(['month']).mean().reindex(months)
m = m.reset_index()
px.line(m,
x = 'month',
y = 'price')
m1.head()
| | listing_id | date | available | price | adjusted_price | minimum_nights | maximum_nights | name | host_id | host_name | ... | room_type | number_of_reviews | last_review | reviews_per_month | calculated_host_listings_count | availability_365 | day_of_week | day_of_month | year | month |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1950925 | 2020-06-21 | t | 1709.0 | $1,709.00 | 1.0 | 1125.0 | Selected two bedroom apartment | 10044315 | Sophy | ... | Private room | 25 | 2017-05-29 | 0.33 | 6 | 365 | Sunday | 21 | 2020 | 6 |
| 1 | 24963 | 2020-06-21 | t | 495.0 | $495.00 | NaN | NaN | Heart of French Concession / Home | 98203 | Jia | ... | Entire home/apt | 84 | 2019-11-22 | 0.71 | 2 | 2 | Sunday | 21 | 2020 | 6 |
| 2 | 24963 | 2020-06-22 | f | 495.0 | $495.00 | NaN | NaN | Heart of French Concession / Home | 98203 | Jia | ... | Entire home/apt | 84 | 2019-11-22 | 0.71 | 2 | 2 | Monday | 22 | 2020 | 6 |
| 3 | 24963 | 2020-06-23 | f | 495.0 | $495.00 | 3.0 | 365.0 | Heart of French Concession / Home | 98203 | Jia | ... | Entire home/apt | 84 | 2019-11-22 | 0.71 | 2 | 2 | Tuesday | 23 | 2020 | 6 |
| 4 | 24963 | 2020-06-24 | t | 495.0 | $495.00 | 3.0 | 365.0 | Heart of French Concession / Home | 98203 | Jia | ... | Entire home/apt | 84 | 2019-11-22 | 0.71 | 2 | 2 | Wednesday | 24 | 2020 | 6 |
5 rows × 24 columns
m1.price.describe()
count 1.511108e+07
mean 6.248737e+02
std 1.799159e+03
min 0.000000e+00
25% 2.290000e+02
50% 3.580000e+02
75% 5.550000e+02
max 7.066500e+04
Name: price, dtype: float64
fig, (ax1, ax2) = plt.subplots(2, 1,
figsize = (20, 10))
sns.boxplot(x = m1['price'],
ax = ax1,
color = 'teal').set_title('Price')
m1['price'].plot(kind = 'hist',
bins = 70,
xlim = (0, 9000),
ax = ax2,
color = 'crimson')
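In the available column, t = available and f = not available. Dates that are not available are counted as booked below.
m2 = m1[['date', 'available']].copy() # copy so that adding a column does not trigger SettingWithCopyWarning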
m2['% booked'] = m2['available'].map(lambda x: 0 if x == 't' else 1)
m2 = m2.groupby('date')['% booked'].mean().reset_index()
# Share of listings booked on each date (a fraction between 0 and 1, despite the column name)
px.line(m2,
x = 'date',
y = '% booked')
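The booked share above is the per-date mean of a 0/1 flag. A minimal vectorised sketch of the same idea on made-up data, scaled to 0-100 so the values match the '% booked' label. One assumption differs from the notebook: this version counts only an explicit 'f' as booked, whereas the `map` above counts anything that is not 't', including missing values.

```python
import pandas as pd

cal = pd.DataFrame({
    'date': pd.to_datetime(['2020-06-21', '2020-06-21',
                            '2020-06-22', '2020-06-22']),
    'available': ['t', 'f', 'f', 'f'],
})

# Comparing to 'f' gives a boolean column; its mean per date is the share
# of listings that are not available on that date.
booked = (cal['available'].eq('f')
          .groupby(cal['date']).mean()
          .mul(100)                      # fraction -> percentage
          .rename('% booked')
          .reset_index())
print(booked)
```

Adding `cal['room_type']` to the groupby keys would give the per-room-type version used for m3.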
m1.room_type.value_counts()
Entire home/apt 8538959
Private room 6094847
Shared room 483373
Hotel room 2190
Name: room_type, dtype: int64
m3 = m1[['date', 'available', 'room_type']].copy() # copy so that adding a column does not trigger SettingWithCopyWarning
m3['% booked'] = m3['available'].map(lambda x: 0 if x == 't' else 1)
m3 = m3.groupby(['date', 'room_type'])['% booked'].mean().reset_index()
px.line(m3,
x = 'date',
y = '% booked',
color = 'room_type')
LD['price'] = LD['price'].str.replace('$', '', regex = False)
LD['price'] = LD['price'].str.replace(',', '', regex = False)
LD['price'] = pd.to_numeric(LD['price'])
LD = LD.loc[LD['price'] < 150000]
#m4 = LD.groupby('property_type', as_index = False)['price'].mean()
px.box(LD,
x = 'property_type', y = 'price')
px.box(LD,
x = 'bed_type',
y = 'price')
px.box(LD,
x = 'beds', y = 'price')
px.histogram(LD, x = 'beds')
LD.amenities.value_counts()
{TV,Wifi,"Air conditioning",Kitchen,Heating,Washer,Dryer,Essentials,Shampoo,Hangers,"Hair dryer",Iron,"Laptop-friendly workspace","Private entrance"} 220
{TV,Wifi,"Air conditioning",Kitchen,Heating,Washer,Dryer,Essentials,Shampoo,Hangers,"Hair dryer",Iron,"Laptop-friendly workspace"} 184
{} 113
{TV,Wifi,"Air conditioning",Kitchen,Heating,Washer,Essentials,Shampoo,Hangers,"Hair dryer",Iron,"Laptop-friendly workspace","Private entrance"} 82
{TV,Wifi,"Air conditioning",Kitchen,Washer,Essentials,Shampoo,Hangers,"Hair dryer",Iron,"Laptop-friendly workspace"} 80
{TV,Wifi,"Air conditioning",Kitchen,Washer,Essentials,Shampoo,Hangers,"Hair dryer","Laptop-friendly workspace"} 58
{TV,Wifi,"Air conditioning",Essentials,Shampoo,"Hair dryer"} 57
{"Air conditioning",Kitchen,"Smoking allowed",Elevator,"Suitable for events",Washer,"Smoke alarm","Carbon monoxide alarm","Fire extinguisher",Hangers,"Hair dryer","Laptop-friendly workspace","Private entrance","Hot water"} 54
{TV,Wifi,"Air conditioning","Free parking on premises","Smoking allowed","Suitable for events","Smoke alarm","First aid kit","Fire extinguisher",Essentials,Shampoo,"Lock on bedroom door","Hair dryer","Laptop-friendly workspace","Private living room","Hot water","Bed linens","Ethernet connection"} 51
{TV,Wifi,"Air conditioning",Kitchen,Washer,Essentials,Shampoo,Hangers,"Hair dryer","Laptop-friendly workspace","Hot water"} 49
{TV,Wifi,"Air conditioning",Heating,Washer,Dryer,Essentials,Shampoo,Hangers,"Hair dryer",Iron,"Laptop-friendly workspace"} 34
{Wifi,"Air conditioning",Kitchen,Washer,Essentials,Shampoo,Hangers,"Hair dryer",Iron,"Laptop-friendly workspace"} 34
{Internet,Wifi,"Air conditioning","Free parking on premises",Heating,Washer,"Smoke alarm","Fire extinguisher",Essentials,Shampoo,"Lock on bedroom door",Hangers,"Hair dryer","Self check-in","Building staff"} 34
{TV,Wifi,"Air conditioning","Free parking on premises",Washer,Dryer,"Smoke alarm","Carbon monoxide alarm","First aid kit","Fire extinguisher",Essentials,Shampoo,"Lock on bedroom door",Hangers,"Hair dryer","Laptop-friendly workspace","Private entrance"} 34
{TV,Wifi,"Air conditioning",Washer,"First aid kit","Fire extinguisher",Essentials,"Lock on bedroom door","Hair dryer","Private entrance"} 33
{TV,Wifi,"Air conditioning","Free parking on premises","Pets allowed","Suitable for events",Washer,"Smoke alarm","Fire extinguisher",Essentials,Shampoo,"Lock on bedroom door","Hair dryer","Laptop-friendly workspace"} 32
{TV,Wifi,"Air conditioning",Kitchen,"Free parking on premises","Smoking allowed","Indoor fireplace",Heating,Washer,Dryer,"Smoke alarm","Carbon monoxide alarm","First aid kit","Fire extinguisher",Essentials,Shampoo,Hangers,"Hair dryer",Iron,"Laptop-friendly workspace","Private entrance"} 32
{TV,Wifi,"Air conditioning","Free parking on premises",Elevator,"Smoke alarm","First aid kit","Fire extinguisher",Essentials,Shampoo,"Lock on bedroom door","Hair dryer","Laptop-friendly workspace","Self check-in","Building staff","Private entrance"} 31
{TV,Wifi,"Air conditioning",Kitchen,Washer,Essentials,Shampoo,"Hair dryer"} 27
{Wifi,"Air conditioning",Kitchen,"Paid parking off premises",Washer,"Lock on bedroom door",Hangers,"Hair dryer","Laptop-friendly workspace","Private living room","Private entrance","Hot water","Bed linens",Microwave,Refrigerator,Stove,"Long term stays allowed","Host greets you"} 27
{TV,Wifi,"Air conditioning","First aid kit","Fire extinguisher",Essentials,"Lock on bedroom door","Hair dryer","Private entrance"} 26
{Wifi,"Air conditioning",Kitchen,"Smoking allowed","Pets allowed",Gym,Elevator,Heating,"Suitable for events",Washer,"Smoke alarm","Carbon monoxide alarm","First aid kit","Fire extinguisher","Laptop-friendly workspace","Private entrance","Hot water"} 24
{TV,Wifi,"Air conditioning",Kitchen,"Free parking on premises",Washer,"Fire extinguisher",Essentials,Shampoo,"Lock on bedroom door",Hangers,"Hair dryer",Iron,"Laptop-friendly workspace","Private living room","Private entrance","Hot water"} 24
{TV,Wifi,"Air conditioning",Kitchen,Washer,"Fire extinguisher",Essentials,Shampoo,"Lock on bedroom door","Hair dryer","Private entrance"} 23
{TV,Wifi,"Air conditioning",Kitchen,"Free parking on premises",Breakfast,Heating,"Suitable for events",Washer,"Smoke alarm","Carbon monoxide alarm","First aid kit","Fire extinguisher",Essentials,Shampoo,"Lock on bedroom door",Hangers,"Hair dryer","Laptop-friendly workspace","Private living room","Private entrance"} 23
{Wifi,"Air conditioning",Kitchen,Washer,Essentials,Shampoo,Hangers,"Hair dryer",Iron} 22
{TV,Wifi,"Air conditioning",Kitchen,"Free parking on premises","Pets allowed",Breakfast,"Suitable for events",Washer,"Smoke alarm","Carbon monoxide alarm","First aid kit","Fire extinguisher",Essentials,Shampoo,"Lock on bedroom door","Hair dryer","Laptop-friendly workspace","Private living room","Private entrance"} 22
{TV,Wifi,"Air conditioning",Kitchen,Washer,Essentials,Shampoo,Hangers,"Hair dryer","Private entrance","Hot water"} 20
{TV,Wifi,"Air conditioning",Kitchen,Elevator,Washer,Essentials,Shampoo,Hangers,"Hair dryer","Laptop-friendly workspace","Hot water"} 20
{TV,Wifi,"Air conditioning",Kitchen,"Smoking allowed","Pets allowed",Gym,Elevator,Heating,"Suitable for events",Washer,Dryer,"Smoke alarm","Carbon monoxide alarm","First aid kit","Fire extinguisher",Hangers,"Laptop-friendly workspace","Private entrance","Hot water"} 20
...
{TV,Wifi,"Air conditioning",Kitchen,"Family/kid friendly","Suitable for events",Washer,Essentials,Shampoo,Hangers,"Hair dryer","Laptop-friendly workspace","Self check-in",Keypad,"Private entrance","Hot water","Bed linens",Microwave,Refrigerator,"Dishes and silverware","Cooking basics",Stove,"Long term stays allowed","Paid parking on premises"} 1
{Wifi,"Air conditioning",Kitchen,"Free parking on premises","Smoking allowed","Family/kid friendly",Washer,"Safety card",Essentials,"Lock on bedroom door",Hangers,"translation missing: en.hosting_amenity_50"} 1
{TV,Wifi,"Air conditioning",Kitchen,"Paid parking off premises","Pets allowed",Washer,"Smoke alarm","Carbon monoxide alarm",Essentials,Shampoo,"Lock on bedroom door",Hangers,"Hair dryer","Laptop-friendly workspace","Private living room","Private entrance","Hot water","Bed linens","Ethernet connection",Microwave,Refrigerator,"Single level home","Luggage dropoff allowed","Long term stays allowed","Host greets you","Paid parking on premises"} 1
{TV,Wifi,"Air conditioning",Kitchen,"Smoking allowed",Washer,"Smoke alarm","Carbon monoxide alarm","Fire extinguisher",Essentials,Shampoo,"Lock on bedroom door",Hangers,"Hair dryer","Laptop-friendly workspace","Self check-in","Smart lock","Private entrance"} 1
{Wifi,"Air conditioning",Kitchen,"Smoking allowed","Pets allowed",Elevator,Washer,"Carbon monoxide alarm","Fire extinguisher","Lock on bedroom door",Hangers,"Hair dryer","Laptop-friendly workspace","Self check-in",Keypad,"Hot water"} 1
{Wifi,"Air conditioning",Kitchen,Heating,Washer,Essentials,Shampoo,"Lock on bedroom door",Hangers,"Hair dryer","Private living room","Hot water","Bed linens","Extra pillows and blankets",Refrigerator,"Dishes and silverware","Cooking basics",Oven,Stove,"Luggage dropoff allowed","Host greets you","Paid parking on premises"} 1
{Wifi,"Air conditioning",Kitchen,"Smoking allowed","Pets allowed","Suitable for events",Washer,Essentials,Shampoo,Hangers,"Hair dryer","Self check-in",Keypad,"Hot water"} 1
{Wifi,"Air conditioning",Kitchen,"Smoking allowed",Heating,Washer,Dryer,Essentials,Shampoo,"Hair dryer","Private entrance"} 1
{TV,"Cable TV",Wifi,"Air conditioning","Wheelchair accessible",Kitchen,"Smoking allowed",Doorman,Breakfast,Heating,"Suitable for events",Washer,Dryer,Essentials,Shampoo,Hangers,"Hair dryer",Iron,"Laptop-friendly workspace","Private living room"} 1
{TV,"Cable TV",Wifi,"Air conditioning","Wheelchair accessible",Kitchen,"Free parking on premises","Smoking allowed",Doorman,Elevator,"Hot tub","Buzzer/wireless intercom",Heating,Washer,"First aid kit","Safety card",Essentials,Shampoo,Hangers,"Hair dryer","Laptop-friendly workspace"} 1
{TV,"Cable TV",Wifi,"Air conditioning",Kitchen,"Free parking on premises","Free street parking","Suitable for events",Washer,"Fire extinguisher",Essentials,Shampoo,"Lock on bedroom door",Hangers,"Hair dryer","Window guards","Hot water","Bed linens","Extra pillows and blankets","Ethernet connection",Microwave,"Coffee maker",Refrigerator,"Dishes and silverware","Cooking basics",Stove,"Patio or balcony","Garden or backyard","Luggage dropoff allowed","Long term stays allowed","Host greets you","Lake access"} 1
{TV,Wifi,"Air conditioning",Kitchen,"Free parking on premises","Smoking allowed","Pets allowed","Suitable for events",Washer,"Smoke alarm","Carbon monoxide alarm",Essentials,Shampoo,Hangers,"Hair dryer","Hot water"} 1
{TV,"Cable TV",Internet,Wifi,"Air conditioning","Wheelchair accessible",Kitchen,"Free parking on premises","Paid parking off premises","Pets allowed",Gym,Breakfast,"Free street parking",Heating,Washer,Dryer,"Smoke alarm","Carbon monoxide alarm","First aid kit","Safety card","Fire extinguisher",Essentials,Shampoo,"Lock on bedroom door",Hangers,"Hair dryer",Iron,"Laptop-friendly workspace","High chair","Children’s books and toys","Hot water","Luggage dropoff allowed","Long term stays allowed","Host greets you","Full kitchen"} 1
{TV,Wifi,"Air conditioning",Kitchen,"Free parking on premises","Family/kid friendly","Suitable for events","Smoke alarm","First aid kit","Fire extinguisher",Essentials,Shampoo,"Lock on bedroom door",Hangers,"Hair dryer",Iron,"Laptop-friendly workspace","Self check-in",Keypad,"Luggage dropoff allowed","Long term stays allowed","Hot water kettle"} 1
{Wifi,"Air conditioning",Elevator,Washer,"Smoke alarm",Essentials,Shampoo,"Hair dryer","Laptop-friendly workspace","Hot water"} 1
{TV,Internet,Wifi,"Air conditioning",Kitchen,"Free parking on premises","Smoking allowed",Gym,Breakfast,Elevator,Heating,"Family/kid friendly",Washer,"Smoke alarm","First aid kit","Safety card","Fire extinguisher",Essentials,Shampoo,"24-hour check-in",Hangers,"Hair dryer",Iron} 1
{TV,Wifi,"Air conditioning",Kitchen,"Free parking on premises",Breakfast,Heating,Washer,Dryer,"Smoke alarm","Carbon monoxide alarm","First aid kit","Fire extinguisher",Essentials,Shampoo,"Lock on bedroom door",Hangers,"Hair dryer",Iron,"Laptop-friendly workspace","Self check-in","Smart lock","Private living room","Private entrance",Bathtub,"High chair","Children’s books and toys",Crib,"Pack ’n Play/travel crib","Room-darkening shades","Hot water","Bed linens","Extra pillows and blankets",Microwave,"Coffee maker",Refrigerator,"Dishes and silverware","Cooking basics",Oven,Stove,"Luggage dropoff allowed","Long term stays allowed","Cleaning before checkout","Paid parking on premises","Hand Sanitiser","Household Disinfectant","Antibacterial solutions",Thermometer,"Disposable gloves"} 1
{TV,"Cable TV",Wifi,"Air conditioning",Pool,Kitchen,"Free parking on premises",Gym,Breakfast,Elevator,"Hot tub","Indoor fireplace",Heating,"Family/kid friendly",Washer,Dryer,"Smoke alarm","Carbon monoxide alarm","First aid kit","Fire extinguisher",Essentials,Shampoo,"Lock on bedroom door",Hangers,"Hair dryer",Iron,"Laptop-friendly workspace","Private living room","Baby monitor","Outlet covers",Bathtub,"Baby bath","Changing table","High chair","Stair gates","Children’s books and toys","Window guards","Table corner guards","Fireplace guards","Babysitter recommendations",Crib,"Pack ’n Play/travel crib","Room-darkening shades","Children’s dinnerware","Game console","Hot water","Bed linens","Ethernet connection","Luggage dropoff allowed","Long term stays allowed","Host greets you"} 1
{TV,"Cable TV",Wifi,"Air conditioning",Kitchen,"Smoking allowed","Pets allowed",Doorman,Breakfast,"Indoor fireplace","Buzzer/wireless intercom",Heating,Washer,"Safety card","Fire extinguisher",Essentials,Shampoo,"Lock on bedroom door",Hangers,"Hair dryer",Iron,"Laptop-friendly workspace","Private living room","Private entrance"} 1
{TV,"Cable TV",Wifi,"Air conditioning",Kitchen,"Paid parking off premises",Washer,Essentials,Shampoo,"Lock on bedroom door",Hangers,"Hair dryer","Laptop-friendly workspace","Self check-in","Smart lock","Private entrance","Window guards","Hot water","Bed linens","Extra pillows and blankets","Ethernet connection",Refrigerator,"Dishes and silverware","Cooking basics",Stove,"Patio or balcony","Luggage dropoff allowed","Long term stays allowed","Cleaning before checkout","Paid parking on premises"} 1
{Wifi,"Air conditioning","Free parking on premises","Paid parking off premises","Smoking allowed","Pets live on this property",Dog(s),"Free street parking","Family/kid friendly",Washer,"Smoke alarm","Carbon monoxide alarm","First aid kit","Fire extinguisher",Essentials,Shampoo,"Lock on bedroom door",Hangers,"Hair dryer","Laptop-friendly workspace","Self check-in","Smart lock","Private living room","Private entrance","Hot water","Paid parking on premises"} 1
{TV,Wifi,"Air conditioning",Kitchen,"Free parking on premises","Paid parking off premises","Smoking allowed","Pets allowed",Breakfast,Heating,"Suitable for events",Washer,"Smoke alarm",Essentials,Shampoo,"Lock on bedroom door",Hangers,"Hair dryer","Laptop-friendly workspace","Private entrance","Hot water","Luggage dropoff allowed","Long term stays allowed","Host greets you"} 1
{TV,"Cable TV",Wifi,"Air conditioning",Kitchen,"Free parking on premises","Paid parking off premises",Elevator,"Free street parking",Heating,"Suitable for events",Washer,"First aid kit","Fire extinguisher",Essentials,Shampoo,Hangers,"Hair dryer","Laptop-friendly workspace","Self check-in","Smart lock","Outlet covers","Window guards","Room-darkening shades","Hot water","Bed linens","Extra pillows and blankets","Ethernet connection",Microwave,Refrigerator,"EV charger","Luggage dropoff allowed","Long term stays allowed","Paid parking on premises"} 1
{TV,Wifi,"Air conditioning",Kitchen,"Free parking on premises","Smoking allowed","Suitable for events",Washer,Essentials,Shampoo,Hangers,"Laptop-friendly workspace"} 1
{Internet,Wifi,"Air conditioning",Kitchen,"Smoking allowed",Breakfast,"Pets live on this property",Dog(s),"Family/kid friendly","Suitable for events",Washer,Dryer,"First aid kit","Fire extinguisher",Essentials,Shampoo,"Lock on bedroom door","Hair dryer","translation missing: en.hosting_amenity_50"} 1
{TV,"Cable TV",Wifi,"Air conditioning","Wheelchair accessible",Kitchen,"Free parking on premises",Doorman,Breakfast,"Hot tub","Suitable for events",Washer,Dryer,"Carbon monoxide alarm","First aid kit","Safety card","Fire extinguisher",Essentials,Shampoo,Hangers,"Hair dryer",Iron,"Laptop-friendly workspace"} 1
{TV,Wifi,"Air conditioning",Pool,Kitchen,"Free parking on premises",Breakfast,Heating,Washer,Dryer,"Smoke alarm","Carbon monoxide alarm","First aid kit","Fire extinguisher",Essentials,Shampoo,"Lock on bedroom door",Hangers,"Hair dryer",Iron,"Laptop-friendly workspace","High chair","Stair gates","Children’s books and toys","Table corner guards","Room-darkening shades","Children’s dinnerware",Microwave,"Coffee maker",Refrigerator,"Dishes and silverware","Cooking basics",Oven,"Patio or balcony","Garden or backyard","Luggage dropoff allowed","Long term stays allowed","Cleaning before checkout","Wide entrance for guests","Flat path to guest entrance","Well-lit path to entrance","Baking sheet","Trash can","Bread maker"} 1
{TV,"Cable TV",Wifi,"Air conditioning","Paid parking off premises",Elevator,Heating,Washer,"Smoke alarm","Carbon monoxide alarm","First aid kit","Fire extinguisher",Essentials,Shampoo,Hangers,"Hair dryer",Iron,"Laptop-friendly workspace","Self check-in","Smart lock","Hot water","Bed linens","Ethernet connection",Refrigerator,"Long term stays allowed","Full kitchen","Paid parking on premises"} 1
{TV,"Cable TV",Wifi,"Air conditioning",Kitchen,"Smoking allowed","Pets allowed",Elevator,Heating,"Suitable for events",Washer,Dryer,"Smoke alarm","Carbon monoxide alarm","First aid kit","Fire extinguisher",Essentials,Shampoo,Hangers,"Hair dryer",Iron,"Private living room","Private entrance","Hot water","Bed linens","Extra pillows and blankets","Ethernet connection","Pocket wifi",Microwave,"Coffee maker",Refrigerator,Dishwasher,"Dishes and silverware","Cooking basics",Oven,Stove,"Garden or backyard"} 1
{TV,Wifi,"Air conditioning",Kitchen,"Smoking allowed",Elevator,"Free street parking",Heating,"Suitable for events",Washer,Dryer,"Smoke alarm",Essentials,Shampoo,Hangers,"Hair dryer","Laptop-friendly workspace","Self check-in","Building staff","Private entrance","Hot water","Bed linens","Long term stays allowed","Hot water kettle","Floor-to-ceiling window"} 1
Name: amenities, Length: 31651, dtype: int64
# Reload the listings and strip the curly braces from amenities
with gzip.open('listings.csv.gz') as l:
    LD = pd.read_csv(l)
LD['amenities'] = LD['amenities'].str.replace('[{}]', '', regex = True).str.replace('""', '', regex = False)
LD.amenities.head(1)
0 TV,Wifi,"Air conditioning",Kitchen,"Free parki...
Name: amenities, dtype: object
a1 = pd.Series(np.concatenate(LD['amenities'].map(lambda amn: amn.split(','))))\
.value_counts().head(10)
# Name the index and the counts explicitly; the default column names differ between pandas versions
a1 = a1.rename_axis('amenity').reset_index(name = 'Count')
px.bar(a1,
x = 'amenity', y = 'Count',
color = 'amenity')
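The amenity counts above come from concatenating the split lists with NumPy. A minimal sketch of an alternative that stays in pandas, using `str.split` and `explode`, on made-up amenity strings in the same brace-and-quote format. Unlike the notebook, it also strips the double quotes, so `"Air conditioning"` and `Air conditioning` count as one label. It assumes that no amenity name contains a comma.

```python
import pandas as pd

amenities = pd.Series([
    '{TV,Wifi,"Air conditioning"}',
    '{Wifi,"Air conditioning",Kitchen}',
    '{}',
])

tokens = (amenities.str.strip('{}')     # drop the surrounding braces
                   .str.split(',')      # one list of amenities per listing
                   .explode()           # one row per (listing, amenity)
                   .str.strip('"'))     # remove the quotes around multi-word names
tokens = tokens[tokens != '']           # an empty '{}' leaves an empty string

counts = (tokens.value_counts()
                .rename_axis('amenity')
                .reset_index(name='Count'))
print(counts)
```

`counts.head(10)` would then give a frame that can be passed to `px.bar` in the same way as a1.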
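m4 = LD[['id', 'price', 'amenities']].copy() # copy so that modifying price does not trigger SettingWithCopyWarning
m4['price'] = m4['price'].str.replace('$', '', regex = False).str.replace(',', '', regex = False)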
m4['price'] = pd.to_numeric(m4['price'])
# Find unique objects after splitting on comma
# denoted as 'a' in the lambda transform
am_items = np.unique(np.concatenate(m4['amenities'].map(lambda a: a.split(','))))
# Mean price of the listings whose amenities string contains each item (substring match), excluding empty strings
ap = [(a,
m4[m4['amenities'].map(lambda am: a in am)]['price'].mean()) for a in am_items if a !=""]
series = pd.Series(data = [a[1] for a in ap],
index = [a[0] for a in ap])
# top 10
series = series.sort_values(ascending = False)[:10]
series.head()
"Mountain view" 11155.000000
"Outdoor kitchen" 7393.666667
"Fire pit" 5851.000000
"Brick oven" 5761.000000
"Tennis court" 4737.400000
dtype: float64
series = pd.DataFrame(series)
series.reset_index(inplace = True)
series.rename(columns = {'index': 'amenity',
0: 'price'}, inplace = True)
series.head()
| | amenity | price |
|---|---|---|
| 0 | "Mountain view" | 11155.000000 |
| 1 | "Outdoor kitchen" | 7393.666667 |
| 2 | "Fire pit" | 5851.000000 |
| 3 | "Brick oven" | 5761.000000 |
| 4 | "Tennis court" | 4737.400000 |
px.bar(series,
x = 'amenity',
y = 'price',
color = 'amenity')
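The `a in am` test used to build these means is a substring match, so an item such as TV also matches listings that only offer "Cable TV". A minimal sketch of an exact-match version on made-up listings, which also reports how many listings each mean rests on. The frame and its values are illustrative only, and the approach is an alternative to the notebook's method rather than a reproduction of it.

```python
import pandas as pd

listings = pd.DataFrame({
    'id': [1, 2, 3],
    'price': [100.0, 300.0, 500.0],
    'amenities': ['TV,Wifi', '"Cable TV",Wifi', 'Wifi,"Mountain view"'],
})

# One row per (listing, amenity), with the quotes stripped
per_amenity = (listings
               .assign(amenity=listings['amenities'].str.split(','))
               .explode('amenity'))
per_amenity['amenity'] = per_amenity['amenity'].str.strip('"')
per_amenity = per_amenity[per_amenity['amenity'] != '']

# Mean price and number of listings for each exact amenity label
summary = (per_amenity.groupby('amenity')['price']
                      .agg(['mean', 'count'])
                      .sort_values('mean', ascending=False))
print(summary)
```

In this toy data the mean for TV is 100, whereas the substring test would also pick up the "Cable TV" listing and report 200. The `count` column helps flag amenities whose high average comes from only one or two listings.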